An infinite word $w$ avoids a pattern $p$ with the involution $\theta$ if there is no substitution for the
variables in $p$ and no involution $\theta$ such that the resulting word is a factor of $w$. We investigate
the avoidance of patterns with respect to the size of the alphabet. For example, it is shown that the
pattern $\alpha\,\theta(\alpha)\,\alpha$ can be avoided over three letters but not over two letters, whereas
it is well known that $\alpha\alpha\alpha$ is avoidable over two letters.

1 Introduction 

The avoidability of patterns in infinite words is an old area of interest, with a first systematic study
going back to Thue [5, 6]. This field includes rediscoveries and studies by many authors over the last one
hundred years; see for example [2] and [4] for surveys. In this article, we are concerned with a variation
of the theme by considering avoidable patterns with involution. An involution $\theta$ is a mapping such
that $\theta^2$ is the identity. We consider morphic involutions, where $\theta(uv) = \theta(u)\theta(v)$,
and antimorphic involutions, where $\theta(uv) = \theta(v)\theta(u)$. The subject of this article draws
considerable motivation from applications in biology, where the Watson-Crick complement corresponds to an
antimorphic involution in our setting. Our considerations are more general, however, in that we consider
any alphabet size and also morphic involutions.

During the review phase of this article, James Currie [3] presented a solution for all those patterns
under involution in $\{\alpha, \theta(\alpha)\}^*$ that we do not consider here, which leads to a
characterization of the avoidance index for all unary patterns under involution.

2 Preliminaries 

Our notation is guided by what is commonly found in the literature; see for example the first chapter
of [4] as a reference. Let $\Sigma$ be a finite alphabet of letters, and let $\Sigma^*$ denote the set of all
finite and $\Sigma^\omega$ the set of all (right-)infinite words over $\Sigma$. Let $\varepsilon$ denote the
empty word. Letters are usually denoted by $a$, $b$, or $c$, and words over $\Sigma$ are usually denoted by
$u$, $v$, or $w$ in this paper. The $i$-th letter of a word $w$ is denoted by $w_{[i]}$, that is,
$w = w_{[1]} w_{[2]} \cdots w_{[n]}$ if $w$ is finite, and the length $n$ of $w$ is denoted by $|w|$ as usual.

Besides $\Sigma$ we need another finite set $E$ of symbols. The elements of $E$ are called variables and we
usually denote them by $\alpha$, $\beta$, or $\gamma$. Words in $E^*$ are called patterns. For example,
$\alpha\beta\alpha \in E^*$ is a pattern consisting of the variables $\alpha$ and $\beta$ in $E$. We assign to
every pattern a pattern language over the alphabet $\Sigma$. This language contains every word that can be
generated by substituting all variables in the pattern by non-empty words in $\Sigma^*$. For example, the
pattern language of the pattern $\alpha\alpha$ over $\Sigma = \{a,b\}$ is
$\{aa, bb, aaaa, abab, baba, bbbb, \ldots\}$.
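
The definition of the pattern language can be tried out on small cases. The following Python sketch is only an illustration: the names `pattern_language` and `contains`, and the encoding of variables as upper-case letters (so that `"AA"` stands for $\alpha\alpha$), are assumptions made here and are not notation from the article. It enumerates by brute force and is only practical for short words.

```python
from itertools import product

def pattern_language(pattern, alphabet, max_len):
    """Return all words of length <= max_len in the pattern language.

    `pattern` is a string of variable names (e.g. "AA" for the pattern αα);
    every variable is replaced by a non-empty word over `alphabet`.
    """
    variables = sorted(set(pattern))
    # Candidate images: all non-empty words that are short enough to matter.
    images = [''.join(t)
              for n in range(1, max_len + 1)
              for t in product(alphabet, repeat=n)]
    words = set()
    for choice in product(images, repeat=len(variables)):
        substitution = dict(zip(variables, choice))
        word = ''.join(substitution[v] for v in pattern)
        if len(word) <= max_len:
            words.add(word)
    return words

def contains(word, pattern, alphabet):
    """True if some factor of `word` lies in the pattern language."""
    language = pattern_language(pattern, alphabet, len(word))
    return any(word[i:j] in language
               for i in range(len(word))
               for j in range(i + 1, len(word) + 1))

squares = pattern_language("AA", "ab", 4)
print(sorted(squares, key=lambda s: (len(s), s)))
# ['aa', 'bb', 'aaaa', 'abab', 'baba', 'bbbb']
```

The printed list reproduces the words of length at most 4 given in the example above.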

We say that a word $w$ avoids a pattern if no factor of $w$ is in the pattern language. On the
other hand, if a factor of $w$ is an element of the pattern language, we say that $w$ contains the pattern.
If for a given pattern $e$ and an alphabet $\Sigma$ with $k$ elements a word $w \in \Sigma^\omega$ exists that
avoids $e$, then we say that $e$ is $k$-avoidable. Otherwise we call $e$ $k$-unavoidable. We call
$k \in \mathbb{N}$ the avoidance index $\mathcal{V}(e)$ of a pattern $e \in E^*$ if $e$ is $k$-avoidable and
$k$ is minimal. If no such $k$ exists, we define $\mathcal{V}(e) = \infty$.

Let $f : \{a,b\}^* \to \{a,b\}^*$ with $a \mapsto ab$ and $b \mapsto ba$. The fixpoint
$t = \lim_{k \to \infty} f^k(a)$ exists and is called the Thue-Morse word. The following result is a
classical one.
Theorem 1 ([5, 6]). The Thue-Morse word avoids the patterns $\alpha\alpha\alpha$ and $\alpha\beta\alpha\beta\alpha$.
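
As a small sanity check of Theorem 1 on a finite prefix, the sketch below iterates the morphism $f$ and searches for overlaps. The function names `thue_morse` and `has_overlap` are chosen here for illustration. The check relies on the observation that every cube $uuu$ and every instance of $\alpha\beta\alpha\beta\alpha$ contains a factor $v v x$, where $x$ is the first letter of $v$.

```python
def thue_morse(k):
    """Return f^k(a) for the morphism f: a -> ab, b -> ba."""
    word = "a"
    for _ in range(k):
        word = ''.join("ab" if x == "a" else "ba" for x in word)
    return word

def has_overlap(word):
    """True if `word` has a factor v v x, where x is the first letter of v.

    A cube uuu contains such a factor (take v = u), and so does every
    instance of αβαβα (take v = αβ).
    """
    n = len(word)
    for period in range(1, n // 2 + 1):
        for i in range(n - 2 * period):
            if (word[i:i + period] == word[i + period:i + 2 * period]
                    and word[i + 2 * period] == word[i]):
                return True
    return False

prefix = thue_morse(8)        # the first 256 letters of the Thue-Morse word
print(thue_morse(3))          # abbabaab
print(has_overlap(prefix))    # False
```

A finite check of this kind does not prove the theorem; it only confirms that no counterexample shows up in the tested prefix.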

3 Patterns with Involution 

For introducing patterns with involution, we extend the set of pattern variables $E$ by adding
$\theta(\alpha)$ for all variables $\alpha \in E$ and some involution $\theta$. For the rest of the article,
we will stick to this definition of $E$. Given a morphic or antimorphic involution, we build the
corresponding pattern language by replacing the variables by non-empty words and, for variables of the
form $\theta(\alpha)$, by applying the involution after the substitution.

For example, let $\theta$ be the morphic involution with $a \mapsto b$ and $b \mapsto a$ over
$\Sigma = \{a,b\}$, and let the pattern be $\alpha\,\theta(\alpha)$. We get the pattern language
$\{ab, ba, aabb, abba, baab, bbaa, \ldots\}$. Every word in
$\{a,b\}^\omega \setminus (a^\omega \cup b^\omega)$ contains the pattern $\alpha\,\theta(\alpha)$ for the
morphic involution $\theta$ with $a \mapsto b$ and $b \mapsto a$.
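
The example can be reproduced with a few lines of Python. This is a minimal sketch under the assumption that an involution is given by an involutive map on the letters, applied letter by letter in the morphic case and followed by a reversal in the antimorphic case; the helper name `apply_involution` is hypothetical.

```python
from itertools import product

def apply_involution(word, letter_map, antimorphic=False):
    """Apply the involution induced by `letter_map` to `word`.

    Morphic: map letter by letter. Antimorphic: additionally reverse.
    """
    assert all(letter_map[letter_map[x]] == x for x in letter_map), "not an involution"
    image = ''.join(letter_map[x] for x in word)
    return image[::-1] if antimorphic else image

swap = {"a": "b", "b": "a"}

# Words u θ(u) for all u of length 1 and 2 (morphic case).
instances = [u + apply_involution(u, swap)
             for n in (1, 2)
             for u in map(''.join, product("ab", repeat=n))]
print(instances)
# ['ab', 'ba', 'aabb', 'abba', 'baab', 'bbaa']

# The antimorphic involution with the same letter map fixes the word ab.
print(apply_involution("ab", swap, antimorphic=True))   # ab
```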
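Observation 2. Let $\theta$ be a morphic or antimorphic involution and not the identity or reversal mapping.
Then every pattern that contains variables of the form $\alpha$ and $\theta(\alpha)$ is avoidable.

Indeed, since $\theta$ is not the identity or reversal mapping, a letter $a \in \Sigma$ with
$\theta(a) \neq a$ exists. Therefore $w = a^\omega$ avoids every pattern that includes the variables $\alpha$
and $\theta(\alpha)$: any word $u$ substituted for $\alpha$ would have to be a power of $a$, and then
$\theta(u)$ contains the letter $\theta(a) \neq a$.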

Because of this observation we do not have to examine whether patterns are avoidable or unavoidable for
a given involution. So we now change the point of view. For a given pattern $e \in E^*$, we look at either
all morphic or all antimorphic involutions $\Sigma^* \to \Sigma^*$ at the same time. So we examine, for
example, whether an infinite word $w \in \Sigma^\omega$ exists that avoids a pattern $e$ for all morphic
involutions.

Definition 3. Let $e \in E^*$ be a pattern, possibly with variables of the form $\theta(\alpha)$. We call
$k \in \mathbb{N}$ the morphic (antimorphic) $\theta$-avoidance index $\mathcal{V}^\theta_m(e)$
($\mathcal{V}^\theta_a(e)$) of $e$ if an infinite word $w \in \Sigma^\omega$ over $\Sigma$ with
$|\Sigma| = k$ exists that avoids the pattern $e$ for all morphic (antimorphic) involutions
$\Sigma^* \to \Sigma^*$ and $k$ is minimal. If this does not hold for any $k \in \mathbb{N}$, we define
$\mathcal{V}^\theta_m(e) = \infty$ ($\mathcal{V}^\theta_a(e) = \infty$).

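We establish the first facts about the avoidance of the pattern $\alpha\,\theta(\alpha)\,\alpha$.
Lemma 4. Let $\Sigma$ be a binary alphabet. Then there is no word $w \in \Sigma^\omega$ that avoids the
pattern $\alpha\,\theta(\alpha)\,\alpha$ for all morphic involutions $\theta : \Sigma^* \to \Sigma^*$. That is,
$\mathcal{V}^\theta_m(\alpha\,\theta(\alpha)\,\alpha) > 2$.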

Proof. Let $\Sigma = \{a,b\}$. We try to construct a word $w \in \Sigma^\omega$ that avoids
$e = \alpha\,\theta(\alpha)\,\alpha$ for all morphic involutions and derive a contradiction. In particular,
this word must not contain $aaa$, $bbb$, $aba$, or $bab$ as a factor. Without loss of generality $w$ begins
with $a$.

Case 1: Assume that $w$ begins with $ab$. Then this prefix must be followed by $b$, that is,
$abb \le_p w$. The next letter must be an $a$, and the fifth letter must be an $a$ too. So we have
$abbaa \le_p w$. If the following letter is an $a$, then $aaa$ is a factor of $w$. So the next letter must
be $b$. But then, for the morphic involution $\theta$ with $a \mapsto b$ and $b \mapsto a$, the word
$ab\,\theta(ab)\,ab = abbaab$ is a factor of $w$.

Case 2: The argument for the case $aa \le_p w$ is analogous to Case 1. □

The proof of the following lemma is analogous to the previous one.
Lemma 5. Let $\Sigma$ be a binary alphabet. There is no word $w \in \Sigma^\omega$ that avoids the pattern
$\alpha\,\theta(\alpha)\,\alpha$ for all antimorphic involutions $\theta : \Sigma^* \to \Sigma^*$. That is,
$\mathcal{V}^\theta_a(\alpha\,\theta(\alpha)\,\alpha) > 2$.
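
The case analysis behind Lemmas 4 and 5 can also be explored by exhaustive search. The sketch below is not the argument of the article; it assumes that every involution is induced by an involutive permutation of the letters (combined with reversal in the antimorphic case), and the names `involutions`, `contains_pattern`, and `longest_avoiding` are chosen here for illustration. It extends avoiding words letter by letter until no extension survives.

```python
from itertools import permutations

def involutions(alphabet):
    """All letter permutations pi with pi(pi(x)) == x, as dictionaries."""
    maps = []
    for image in permutations(alphabet):
        pi = dict(zip(alphabet, image))
        if all(pi[pi[x]] == x for x in alphabet):
            maps.append(pi)
    return maps

def contains_pattern(word, alphabet, antimorphic):
    """True if a factor of `word` equals u θ(u) u for some non-empty u and
    some (morphic or antimorphic) involution θ induced by a letter map."""
    maps = involutions(alphabet)
    for n in range(1, len(word) // 3 + 1):
        for i in range(len(word) - 3 * n + 1):
            u = word[i:i + n]
            if word[i + 2 * n:i + 3 * n] != u:
                continue
            middle = word[i + n:i + 2 * n]
            for pi in maps:
                image = ''.join(pi[x] for x in u)
                if antimorphic:
                    image = image[::-1]
                if image == middle:
                    return True
    return False

def longest_avoiding(alphabet, antimorphic, limit=20):
    """Length of the longest avoiding word, or None if `limit` is reached."""
    level, length = [""], 0
    while length < limit:
        level = [w + x for w in level for x in alphabet
                 if not contains_pattern(w + x, alphabet, antimorphic)]
        if not level:
            return length
        length += 1
    return None

print(longest_avoiding("ab", antimorphic=False))   # 5
print(longest_avoiding("ab", antimorphic=True))    # 5
```

Under these assumptions the search stops at length 5 in both cases: a word such as $abbaa$ still avoids the pattern, while the only candidates of length 6 that avoid $aaa$, $bbb$, $aba$, and $bab$ (namely $abbaab$, $aabbaa$, and their images under exchanging $a$ and $b$) already contain it.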

4 Main Result

In this section, we establish the $\theta$-avoidance indices for the pattern $\alpha\,\theta(\alpha)\,\alpha$
in the morphic and antimorphic case. We start with the morphic case.

Theorem 6. It holds that $\mathcal{V}^\theta_m(\alpha\,\theta(\alpha)\,\alpha) = 3$.

Proof. Let $\Sigma$ be an alphabet with three elements, $\Sigma = \{a,b,c\}$. Let $v$ be the infinite
Thue-Morse word over the letters $a'$ and $b'$. Furthermore, let $w \in \Sigma^\omega$ be the word obtained
by replacing every $a'$ in $v$ by $aacb$ and every $b'$ by $accb$. We will show that $w$ avoids the pattern
$\alpha\,\theta(\alpha)\,\alpha$ for all morphic involutions. For better readability, we define $x = aacb$
and $y = accb$.

Assume, for the sake of contradiction, that there exist a morphic involution $\theta$ and a substitution
for $\alpha$ such that $\alpha\,\theta(\alpha)\,\alpha$ is a factor of $w$. First, we examine the
possibilities of replacing the variable $\alpha$ by words $u \in \Sigma^+$ of length $|u| < 7$. The word
$u\,\theta(u)\,u$ then has a maximal length of 18. Therefore there must exist a morphic involution such that
$u\,\theta(u)\,u$ is a factor of a word $w' \in \{x,y\}^6$. Because of Theorem 1, the words $xxx$, $yyy$,
$xyxyx$, and $yxyxy$ cannot be factors of $w'$. A computer program can easily check these finitely many
possibilities, with the result that no words $u$ and $w'$ exist that fulfill the conditions. Now we assume
that $\alpha$ gets replaced by a word $u \in \Sigma^+$ with $|u| \ge 7$. Then the word $u$ contains $aacb$ or
$accb$. Without loss of generality, $u$ contains $aacb$. Therefore, $\theta(u)$ contains the factor
$\theta(aac) = \theta(a)\,\theta(a)\,\theta(c)$. In addition, $\theta(u)$, and for this reason also
$\theta(a)\,\theta(a)\,\theta(c)$, is a factor of $w$. There are only two possibilities for two succeeding
identical letters in $w$: either these letters are two letters $c$ followed by the letter $b$, or two
letters $a$ followed by the letter $c$. This implies that $u\,\theta(u)\,u$ can only be a factor of $w$ if
$\theta$ is the identity mapping. Furthermore, this implies $|u| = 4k$ for some $k \in \mathbb{N}$. This is
visualized in Fig. 1, where $w_i, w_i', w_i'' \in \{x,y\}$ holds for all $0 \le i < k$. If the word
$(w_0)_{[2]}(w_0)_{[3]}(w_0)_{[4]}$ or $(w_0)_{[1]}(w_0)_{[2]}(w_0)_{[3]}(w_0)_{[4]} = w_0$ is a prefix of the
first $u$ in Fig. 1, then the following equations apply:

$w_0 = w_0' = w_0''$
$w_1 = w_1' = w_1''$
$\vdots$
$w_{k-1} = w_{k-1}' = w_{k-1}''$

The word $w_0 w_1 \cdots w_{k-1}\, w_0' w_1' \cdots w_{k-1}'\, w_0'' w_1'' \cdots w_{k-1}'' = (w_0 w_1 \cdots w_{k-1})^3$
is a factor of $w$. Because of $w_i \in \{x,y\}$ for all $0 \le i \le k-1$, this is a contradiction to
Theorem 1. On the other hand, if only $(w_0)_{[3]}(w_0)_{[4]}$ or $(w_0)_{[4]}$ is a prefix of $u$, then
$w_0 \neq w_0'$ is possible. But in this case $(w_k'')_{[1]}(w_k'')_{[2]}$ or
$(w_k'')_{[1]}(w_k'')_{[2]}(w_k'')_{[3]}$ is a suffix of the third $u$. This implies

$w_1 = w_1' = w_1''$
$w_2 = w_2' = w_2''$
$\vdots$
$w_k = w_k' = w_k''$

and $w_1 w_2 \cdots w_k\, w_1' w_2' \cdots w_k'\, w_1'' w_2'' \cdots w_k'' = (w_1 w_2 \cdots w_k)^3$ is a factor
of $w$. Again, this is a contradiction to Theorem 1. The theorem follows with Lemma 4. □
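
The construction in the proof of Theorem 6 can be tested on a finite prefix of $w$. The following Python sketch is not the computer program mentioned in the proof; it is a plain brute-force search, written under the assumption that morphic involutions are exactly the maps induced by involutive letter permutations. The names `morphic_involutions` and `find_occurrence` are hypothetical.

```python
from itertools import permutations

def thue_morse(k):
    """Return f^k(a) for the morphism f: a -> ab, b -> ba."""
    word = "a"
    for _ in range(k):
        word = ''.join("ab" if x == "a" else "ba" for x in word)
    return word

def morphic_involutions(alphabet):
    """All involutive letter permutations, as dictionaries."""
    maps = []
    for image in permutations(alphabet):
        pi = dict(zip(alphabet, image))
        if all(pi[pi[x]] == x for x in alphabet):
            maps.append(pi)
    return maps

def find_occurrence(word, alphabet):
    """Return (start, u, letter map) for a factor u θ(u) u, or None."""
    maps = morphic_involutions(alphabet)
    for n in range(1, len(word) // 3 + 1):
        for i in range(len(word) - 3 * n + 1):
            u = word[i:i + n]
            if word[i + 2 * n:i + 3 * n] != u:
                continue
            middle = word[i + n:i + 2 * n]
            for pi in maps:
                if ''.join(pi[x] for x in u) == middle:
                    return i, u, pi
    return None

# Here a and b play the role of a' and b': a' -> x = aacb, b' -> y = accb.
blocks = {"a": "aacb", "b": "accb"}
w_prefix = ''.join(blocks[x] for x in thue_morse(6))   # 64 blocks, 256 letters

print(w_prefix[:16])                      # aacbaccbaccbaacb
print(find_occurrence(w_prefix, "abc"))   # None
```

The search reports no occurrence in the first 256 letters, which is consistent with the theorem but of course covers only this prefix.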

The result of Theorem 6 transfers also to the antimorphic case.
Theorem 7. It holds that $\mathcal{V}^\theta_a(\alpha\,\theta(\alpha)\,\alpha) = 3$.
Figure 1: Part of w to illustrate the factor uuu
Figure 2: Part of w and the factor u of w



Proof. This proof follows the proof of the previous theorem. Let $\Sigma$ be an alphabet with three
elements, $\Sigma = \{a,b,c\}$. Further, let $v$ be the Thue-Morse word over the letters $a'$ and $b'$. Let
$w \in \Sigma^\omega$ be the word that we get by replacing $a'$ in $v$ by $aabbc$ and $b'$ by $aaccb$. We will
show that $w$ avoids the pattern $\alpha\,\theta(\alpha)\,\alpha$ for all antimorphic involutions. For better
readability, we define $x = aabbc$ and $y = aaccb$.

We assume that there exist an antimorphic involution $\theta$ and a substitution of $\alpha$ by a word
$u \in \Sigma^+$ such that $u\,\theta(u)\,u$ is a factor of $w$. First we suppose that $|u| < 9$ holds. The
word $u\,\theta(u)\,u$ then has a maximal length of 24, and $u\,\theta(u)\,u$ is a factor of a word
$w' \in \{x,y\}^6$. The words $xxx$, $yyy$, $xyxyx$, and $yxyxy$ must not be factors of $w'$ because of
Theorem 1. A computer program can check these finitely many possibilities, with the result that no words
$u$ and $w'$ exist that fulfill these conditions for an antimorphic involution $\theta$. So $|u| \ge 9$ must
hold, and $u$ contains at least one word $x$ or $y$ completely. We now look at the first $u$ of the factor
$u\,\theta(u)\,u$ of $w$. Let $w_1 w_2' \le_s u$ with $w_1, w_2 \in \{x,y\}$, $w_2 = w_2' w_2''$, and
$|w_2'| < 5$. We get Fig. 2, where $w_3, w_4 \in \{x,y\}$. Without loss of generality, let
$w_1 = x = aabbc$. Then $\theta(u)$, and therefore $w_2 w_3 w_4$, contains the word
$\theta(aabbc) = \theta(c)\,\theta(b)\,\theta(b)\,\theta(a)\,\theta(a)$ of length 5 as a factor. Hence we
look at the following words:

$xx = aabbc\;aabbc$
$xy = aabbc\;aaccb$
$yx = aaccb\;aabbc$
$yy = aaccb\;aaccb$

Only $xx$ contains $\theta(c)\,\theta(b)\,\theta(b)\,\theta(a)\,\theta(a)$, namely for the antimorphic
involution $\theta$ with $a \mapsto b$, $b \mapsto a$, and $c \mapsto c$. Because of $w_1 = x$, the equation
$w_2 w_3 = xx$ is a contradiction to Theorem 1. The case $w_2 w_3 w_4 = yxx$ remains. Now there are five
possibilities for the position of $u$; see Fig. 3. It is easy to check that in all five cases neither
$\theta(u) \le_p w_2'' w_3 w_4$ nor $w_2'' w_3 w_4 \le_p \theta(u)$ holds. So our assumption that there exist
an antimorphic involution $\theta$ and a word $u \in \Sigma^+$ such that $u\,\theta(u)\,u$ is a factor of $w$
was wrong. The theorem follows with Lemma 5. □
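
One step of this proof, the claim about which of the words $xx$, $xy$, $yx$, $yy$ contain $\theta(x)$, is easy to recheck mechanically. The sketch below is an illustration only and again assumes that antimorphic involutions are induced by involutive letter permutations followed by reversal; the helper name `antimorphic_image` is hypothetical.

```python
from itertools import permutations, product

x, y = "aabbc", "aaccb"
alphabet = "abc"
block = {"x": x, "y": y}

def antimorphic_image(word, pi):
    """Image of `word` under the antimorphic involution induced by `pi`."""
    return ''.join(pi[c] for c in reversed(word))

hits = []
for image in permutations(alphabet):
    pi = dict(zip(alphabet, image))
    if any(pi[pi[c]] != c for c in alphabet):
        continue                      # keep involutive letter maps only
    target = antimorphic_image(x, pi)
    for first, second in product("xy", repeat=2):
        if target in block[first] + block[second]:
            hits.append((first + second, target, pi))

print(hits)
# [('xx', 'caabb', {'a': 'b', 'b': 'a', 'c': 'c'})]
```

The single hit is $xx$ together with the involution exchanging $a$ and $b$, in agreement with the statement above.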
Figure 3: Illustration of possible positions of the factor u of w



5 Complementary Patterns 

In this section, patterns similar to $\alpha\,\theta(\alpha)\,\alpha$ are considered.

For the next lemma we need a further definition. Let $e \in E^*$ be a pattern consisting of variables of
the form $\alpha$ and $\theta(\alpha)$, and let $e'$ be the pattern that we get when all variables $\alpha$
and $\theta(\alpha)$ in $e$ are switched. We call $e' \in E^*$ the $\theta$-complementary pattern of $e$. For
example, the $\theta$-complementary pattern of $\alpha\,\alpha\,\theta(\alpha)\,\beta$ is
$\theta(\alpha)\,\theta(\alpha)\,\alpha\,\theta(\beta)$. For this definition it does not matter whether
morphic or antimorphic involutions are examined.

Lemma 8. Let $e \in E^*$ be a pattern and $e' \in E^*$ be the $\theta$-complementary pattern of $e$. Then
$\mathcal{V}^\theta_m(e) = \mathcal{V}^\theta_m(e')$ and $\mathcal{V}^\theta_a(e) = \mathcal{V}^\theta_a(e')$.

Proof. First of all we show $\mathcal{V}^\theta_m(e) = \mathcal{V}^\theta_m(e')$. For better readability, we
replace the variable $\alpha$ in the pattern $e'$ by $\alpha'$ and $\theta(\alpha)$ by $\theta(\alpha')$. We
assume that a word $w \in \Sigma^\omega$ contains the pattern $e$ for a morphic involution and a substitution
of $\alpha$ by $u \in \Sigma^+$. Then $w$ contains the pattern $e'$ for the same morphic involution by
substituting $\alpha'$ by $\theta(u)$. Symmetry reasons imply:

There exists a morphic involution $\theta$ such that $w$ contains the pattern $e$.
$\Leftrightarrow$ There exists a morphic involution $\theta'$ such that $w$ contains the pattern $e'$.

By negation we get:

The word $w \in \Sigma^\omega$ avoids the pattern $e$.
$\Leftrightarrow$ The word $w \in \Sigma^\omega$ avoids the pattern $e'$.

The equation $\mathcal{V}^\theta_m(e) = \mathcal{V}^\theta_m(e')$ follows. The proof of
$\mathcal{V}^\theta_a(e) = \mathcal{V}^\theta_a(e')$ is identical. □
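
The substitution argument in this proof can be illustrated concretely. In the following sketch a pattern is encoded as a list of pairs (variable, flag), where the flag marks variables of the form $\theta(\alpha)$; this encoding and the names `complement`, `instance`, `swap_morphic`, and `swap_antimorphic` are assumptions made here for illustration.

```python
def complement(pattern):
    """θ-complementary pattern: switch every α with θ(α).

    A pattern is a list of (variable, involuted) pairs, where the flag
    marks variables of the form θ(α).
    """
    return [(var, not involuted) for var, involuted in pattern]

def instance(pattern, substitution, theta):
    """Word obtained from `pattern` under `substitution` and involution `theta`."""
    return ''.join(theta(substitution[var]) if involuted else substitution[var]
                   for var, involuted in pattern)

def swap_morphic(word):
    """Morphic involution on {a,b}* with a -> b and b -> a."""
    return word.translate(str.maketrans("ab", "ba"))

def swap_antimorphic(word):
    """Antimorphic involution on {a,b}* with a -> b and b -> a."""
    return swap_morphic(word)[::-1]

e = [("α", False), ("α", False), ("α", True), ("β", False)]   # α α θ(α) β
e_comp = complement(e)                                          # θ(α) θ(α) α θ(β)

sub = {"α": "aab", "β": "b"}
for theta in (swap_morphic, swap_antimorphic):
    sub_comp = {var: theta(u) for var, u in sub.items()}
    # Same word: e under u equals e' under θ(u), because θ(θ(u)) = u.
    assert instance(e, sub, theta) == instance(e_comp, sub_comp, theta)

print(instance(e, sub, swap_morphic))   # aabaabbbab
```

The assertion inside the loop holds for both the morphic and the antimorphic involution, which mirrors the remark that the second half of the proof is identical to the first.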

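Note the following $\theta$-free patterns; see [1].

Observation 9. The patterns $\alpha\alpha$, $\alpha\alpha\beta$, $\beta\alpha\alpha$, $\alpha\alpha\beta\alpha$,
$\alpha\beta\beta\alpha$, $\alpha\alpha\beta\beta$, $\alpha\beta\alpha\beta$, $\alpha\alpha\beta\alpha\alpha$,
and $\alpha\alpha\beta\alpha\beta$ are 2-unavoidable and 3-avoidable.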

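Lemma 10. Let $e \in E^*$ be a pattern that contains the variables $\alpha$ and $\theta(\alpha)$. Further, let
$e$ contain no other variable of the form $\theta(\gamma)$. Let $e'$ be the pattern obtained when all
occurrences of $\theta(\alpha)$ in $e$ are replaced by $\alpha$, and let $e''$ be the pattern obtained when all
occurrences of $\theta(\alpha)$ in $e$ are replaced by a new variable $\beta$.
Then $\mathcal{V}(e') \le \mathcal{V}^\theta_m(e) \le \mathcal{V}(e'')$ and
$\mathcal{V}^\theta_a(e) \le \mathcal{V}(e'')$.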
Proof. The relation $\mathcal{V}(e') \le \mathcal{V}^\theta_m(e)$ holds since the morphic $\theta$-avoidance
index considers all morphic involutions, including the identity mapping. Now say $\mathcal{V}(e'') = k$,
i.e., a word $w \in \Sigma^\omega$ with $|\Sigma| = k$ exists that avoids the pattern $e''$. Then this word
also avoids the pattern $e$ for all morphic and antimorphic involutions. Therefore the relations
$\mathcal{V}^\theta_m(e) \le \mathcal{V}(e'')$ and $\mathcal{V}^\theta_a(e) \le \mathcal{V}(e'')$ hold. □

Lemma 11. It holds that $\mathcal{V}^\theta_m(\alpha\,\alpha\,\theta(\alpha)) = \mathcal{V}^\theta_a(\alpha\,\alpha\,\theta(\alpha)) = 3$.

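Proof. According to Observation 9 the equation $\mathcal{V}(\alpha\alpha\beta) = 3$ holds. Lemma 10 implies
$\mathcal{V}^\theta_a(\alpha\,\alpha\,\theta(\alpha)) \le 3$ and
$\mathcal{V}^\theta_m(\alpha\,\alpha\,\theta(\alpha)) \le 3$. We show by contradiction that
$\mathcal{V}^\theta_a(\alpha\,\alpha\,\theta(\alpha)) \neq 2$. The proof for the relation
$\mathcal{V}^\theta_m(\alpha\,\alpha\,\theta(\alpha)) \neq 2$ is analogous. Assume that a word
$w \in \Sigma^\omega$ with $\Sigma = \{a,b\}$ exists that avoids the pattern $\alpha\,\alpha\,\theta(\alpha)$
for all antimorphic involutions. Then $w$ contains neither $aa$ nor $bb$ as a factor. Without loss of
generality $w$ begins with the letter $a$. It follows that $w = (ab)^\omega$. But $w = (ab)^\omega$ contains
the pattern $\alpha\,\alpha\,\theta(\alpha)$ for $\alpha = ab$ and the antimorphic involution defined by
$a \mapsto b$ and $b \mapsto a$. This is a contradiction to our assumption. Therefore
$\mathcal{V}^\theta_a(\alpha\,\alpha\,\theta(\alpha)) \neq 2$ holds, and analogously
$\mathcal{V}^\theta_m(\alpha\,\alpha\,\theta(\alpha)) \neq 2$. We get
$\mathcal{V}^\theta_a(\alpha\,\alpha\,\theta(\alpha)) = \mathcal{V}^\theta_m(\alpha\,\alpha\,\theta(\alpha)) = 3$. □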
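
The key step of this proof, that $(ab)^\omega$ contains $\alpha\,\alpha\,\theta(\alpha)$, can be written out explicitly. The snippet below is a small illustration with a hypothetical helper `antimorphic_swap`; it only inspects a finite prefix of $(ab)^\omega$.

```python
def antimorphic_swap(word):
    """Antimorphic involution on {a,b}* with a -> b and b -> a."""
    return word.translate(str.maketrans("ab", "ba"))[::-1]

u = "ab"
occurrence = u + u + antimorphic_swap(u)    # instance of α α θ(α) for α = ab
w_prefix = "ab" * 10                        # prefix of (ab)^ω

print(antimorphic_swap(u))      # ab
print(occurrence)               # ababab
print(occurrence in w_prefix)   # True
```

Since this involution maps $ab$ to itself, the instance is the cube $ababab$. For the morphic case the same factor arises from $\alpha = ab$ and the identity mapping.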

Lemma 12. It holds that $\mathcal{V}^\theta_m(\theta(\alpha)\,\alpha\,\alpha) = \mathcal{V}^\theta_a(\theta(\alpha)\,\alpha\,\alpha) = 3$.

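Proof. The proof is analogous to the proof of Lemma 11. □
Corollary 13.

1. $\mathcal{V}^\theta_m(\theta(\alpha)\,\alpha\,\theta(\alpha)) = \mathcal{V}^\theta_a(\theta(\alpha)\,\alpha\,\theta(\alpha)) = 3$ by Theorems 6 and 7.

2. $\mathcal{V}^\theta_m(\theta(\alpha)\,\theta(\alpha)\,\alpha) = \mathcal{V}^\theta_a(\theta(\alpha)\,\theta(\alpha)\,\alpha) = 3$ by Lemma 11.

3. $\mathcal{V}^\theta_m(\alpha\,\theta(\alpha)\,\theta(\alpha)) = \mathcal{V}^\theta_a(\alpha\,\theta(\alpha)\,\theta(\alpha)) = 3$ by Lemma 12.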